large-scale image classification
Deep Fisher Networks for Large-Scale Image Classification
As massively parallel computations have become broadly available with modern GPUs, deep architectures trained on very large datasets have risen in popularity. Discriminatively trained convolutional neural networks, in particular, were recently shown to yield state-of-the-art performance in challenging image classification benchmarks such as ImageNet. However, elements of these architectures are similar to standard hand-crafted representations used in computer vision. In this paper, we explore the extent of this analogy, proposing a version of the state-of-the-art Fisher vector image encoding that can be stacked in multiple layers. This architecture significantly improves on standard Fisher vectors, and obtains competitive results with deep convolutional networks at a significantly smaller computational cost. Our hybrid architecture allows us to measure the performance improvement brought by a deeper image classification pipeline, while staying in the realms of conventional SIFT features and FV encodings.
Towards Transfer Learning for Large-Scale Image Classification Using Annealing-based Quantum Boltzmann Machines
Schuman, Daniëlle, Sünkel, Leo, Altmann, Philipp, Stein, Jonas, Roch, Christoph, Gabor, Thomas, Linnhoff-Popien, Claudia
Quantum Transfer Learning (QTL) recently gained popularity as a hybrid quantum-classical approach for image classification tasks by efficiently combining the feature extraction capabilities of large Convolutional Neural Networks with the potential benefits of Quantum Machine Learning (QML). Existing approaches, however, only utilize gate-based Variational Quantum Circuits for the quantum part of these procedures. In this work we present an approach to employ Quantum Annealing (QA) in QTL-based image classification. Specifically, we propose using annealing-based Quantum Boltzmann Machines as part of a hybrid quantum-classical pipeline to learn the classification of real-world, large-scale data such as medical images through supervised training. We demonstrate our approach by applying it to the three-class COVID-CT-MD dataset, a collection of lung Computed Tomography (CT) scan slices. Using Simulated Annealing as a stand-in for actual QA, we compare our method to classical transfer learning, using a neural network of the same order of magnitude, to display its improved classification performance. We find that our approach consistently outperforms its classical baseline in terms of test accuracy and AUC-ROC-Score and needs less training epochs to do this.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.04)
- Europe > Czechia (0.04)
Deep Fisher Networks for Large-Scale Image Classification
Simonyan, Karen, Vedaldi, Andrea, Zisserman, Andrew
As massively parallel computations have become broadly available with modern GPUs, deep architectures trained on very large datasets have risen in popularity. Discriminatively trained convolutional neural networks, in particular, were recently shown to yield state-of-the-art performance in challenging image classification benchmarks such as ImageNet. However, elements of these architectures are similar to standard hand-crafted representations used in computer vision. In this paper, we explore the extent of this analogy, proposing a version of the state-of-the-art Fisher vector image encoding that can be stacked in multiple layers. This architecture significantly improves on standard Fisher vectors, and obtains competitive results with deep convolutional networks at a significantly smaller computational cost. Our hybrid architecture allows us to measure the performance improvement brought by a deeper image classification pipeline, while staying in the realms of conventional SIFT features and FV encodings.
Problems in Large-Scale Image Classification
Guo, Yuchen (Tsinghua Univerisity)
The number of images is growing rapidly in recent years because of development of Internet, especially the social networks like Facebook, and the popularization of portable image capture devices like smart phone. Annotating them with semantically meaningful words to describe them, i.e., classification, is a useful way to manage these images. However, the huge number of images and classes brings several challenges to classification, of which two are 1) how to measure the similarity efficiently between large-scale images, for example, measuring similarity between samples is the building block for SVM and kNN classifiers, and 2) how to train supervised classification models for newly emerging classes with only a few or even no labeled samples because new concepts appear every day in the Web, like Tesla's Model S. The research of my Ph. D. thesis focuses on the two problems in large-scale image classification mentioned above. Formally, these two problems are termed as large-scale similarity search which focuses on the large scale of samples/images and zero-shot/few-shots learning which focuses on the large scale of classes. Specifically, my research considers the following three aspects: 1) hashing based large-scale similarity search which adopts hashing to improve the efficiency; 2) cross-class transfer active learning which simultaneously transfers knowledge from the abundant labeled samples in the Web and selects the most informative samples for expert labeling such that we can construct effective classifiers for novel classes with only a few labeled samples; and 3) zero-shot learning which utilizes no labeled samples for novel classes at all to build supervised classifiers for them by transferring knowledge from the related classes.
- North America > United States > Wisconsin > Dane County > Madison (0.05)
- Asia > China > Beijing > Beijing (0.05)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.61)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)